The Armada framework for parallel I/O on computational grids
Abstract
An exciting trend in high-performance distributed computing is the development of widely distributed networks of heterogeneous systems and devices, known as computational grids. Grid applications use high-speed networks to logically assemble collections of resources such as scientific instruments, supercomputers, and databases. One important challenge facing grid computing is efficient parallel I/O for data-intensive grid applications. These applications are particularly challenging because they require access to large (terabyte to petabyte) remote data sets and often have computational requirements that can only be met by high-performance supercomputers. In addition, data is often stored in “raw” formats and requires significant preprocessing or filtering before the computation can take place. Such applications exist in seismic processing, climate modeling, physics, astronomy, biology, chemistry, and visualization.
In this report, we present the Armada framework [OK01] for building I/O-access paths for data-intensive grid applications. We designed Armada to allow grid applications to efficiently access data sets distributed across a computational grid, and in particular to allow the application programmer and the data set provider to design and deploy a flexible network of application-specific and data-set-specific functionality across the grid. Using the Armada framework, grid applications access remote data sets by sending data requests through a graph of distributed application objects. The graph is called an “armada” and the objects are called “ships”. Figure 1 shows a simple armada for an application that applies a preprocessing operator to a distributed data set. We expect most applications to access data through existing armadas constructed by a data set provider; however, it is also possible for the application to extend existing armadas with application-specific functionality or to construct entire armadas from scratch.
The armada encodes the programmer’s interface, data layout, caching and prefetching policies, interfaces to heterogeneous data servers, and most other functionality provided by an I/O system. The application sees an armada as an object providing access to a specific type of data through a high-level interface. One use of Armada, for example, is to construct complicated data sets on top of legacy files and databases.
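The idea of routing requests through a graph of ships can be illustrated with a small sketch. This is not the Armada API: the class names (`Ship`, `StorageShip`, `DistributionShip`, `FilterShip`), the `read(start, stop)` interface, and the in-memory "servers" are all hypothetical, chosen only to show a preprocessing operator layered over a data set that is split across two data servers.

```python
from typing import Callable, Iterable, List


class Ship:
    """Hypothetical base class: one node in the armada graph."""

    def read(self, start: int, stop: int) -> List[float]:
        raise NotImplementedError


class StorageShip(Ship):
    """Stands in for one data server holding a contiguous slice of the data set."""

    def __init__(self, offset: int, data: Iterable[float]) -> None:
        self.offset = offset          # global index of this server's first element
        self.data = list(data)

    def read(self, start: int, stop: int) -> List[float]:
        # Clip the requested global range [start, stop) to the local slice.
        lo = max(start, self.offset) - self.offset
        hi = min(stop, self.offset + len(self.data)) - self.offset
        return self.data[lo:hi] if lo < hi else []


class DistributionShip(Ship):
    """Encodes the data layout: forwards a request to every server and merges the results."""

    def __init__(self, servers: Iterable[StorageShip]) -> None:
        self.servers = sorted(servers, key=lambda s: s.offset)

    def read(self, start: int, stop: int) -> List[float]:
        merged: List[float] = []
        for server in self.servers:
            merged.extend(server.read(start, stop))
        return merged


class FilterShip(Ship):
    """Applies a preprocessing operator to data flowing back toward the application."""

    def __init__(self, upstream: Ship, op: Callable[[float], float]) -> None:
        self.upstream = upstream
        self.op = op

    def read(self, start: int, stop: int) -> List[float]:
        return [self.op(x) for x in self.upstream.read(start, stop)]


def build_example_armada() -> Ship:
    """Assemble a three-level graph: filter -> distribution -> two storage ships."""
    servers = [StorageShip(0, [0, 1, 2, 3]), StorageShip(4, [4, 5, 6, 7])]
    return FilterShip(DistributionShip(servers), op=lambda x: x * 0.5)


if __name__ == "__main__":
    armada = build_example_armada()
    print(armada.read(2, 6))  # prints [1.0, 1.5, 2.0, 2.5]
```

In this sketch the application only holds a reference to the outermost ship, which matches the description of an armada as a single object with a high-level interface. A real deployment would place ships on different grid nodes and would add caching or prefetching ships in the same way the filter is added here.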
Publication date: 2002